Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases
We investigate the optimality for model selection of the so-called slope
heuristics, V-fold cross-validation and V-fold penalization in a
heteroscedastic regression context with random design. We consider a new class
of linear models that we call strongly localized bases and that generalize
histograms, piecewise polynomials and compactly supported wavelets. We derive
sharp oracle inequalities that prove the asymptotic optimality of the slope
heuristics---when the optimal penalty shape is known---and of V-fold
penalization. Furthermore, V-fold cross-validation seems to be suboptimal for
a fixed value of V, since it asymptotically recovers the oracle learned from a
sample size equal to 1 - 1/V of the original amount of data. Our results are
based on genuine concentration inequalities for the true and empirical excess
risks that are of independent interest. We show in our experiments the good
behavior of the slope heuristics for the selection of linear wavelet models.
Furthermore, V-fold cross-validation and V-fold penalization have
comparable efficiency.
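As a hedged illustration of the V-fold cross-validation procedure compared above (a minimal sketch, not code or data from the paper: the data-generating process, the candidate dimensions and all helper names are illustrative assumptions), one can select the number of bins of a histogram regression estimator by minimizing the averaged held-out squared error:

```python
# Illustrative sketch of V-fold cross-validation for choosing the number of
# bins of a regressogram; setup and names are assumptions, not the paper's.
import numpy as np

rng = np.random.default_rng(0)
n, V = 400, 5
X = rng.uniform(0.0, 1.0, n)
# Heteroscedastic noise: the standard deviation grows with X.
y = np.sin(2 * np.pi * X) + 0.3 * (1.0 + X) * rng.standard_normal(n)

def regressogram(x_train, y_train, x_eval, n_bins):
    """Piecewise-constant least-squares fit on a regular partition of [0, 1]."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx_tr = np.clip(np.searchsorted(edges, x_train, side="right") - 1, 0, n_bins - 1)
    means = np.zeros(n_bins)
    for b in range(n_bins):
        in_bin = idx_tr == b
        if in_bin.any():
            means[b] = y_train[in_bin].mean()
    idx_ev = np.clip(np.searchsorted(edges, x_eval, side="right") - 1, 0, n_bins - 1)
    return means[idx_ev]

# Fixed random partition of the indices into V folds.
perm = rng.permutation(n)
folds = np.array_split(perm, V)

def vfold_risk(n_bins):
    """Average squared prediction error on the held-out fold, over the V folds."""
    risks = []
    for k in range(V):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(V) if j != k])
        pred = regressogram(X[train], y[train], X[test], n_bins)
        risks.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(risks))

candidates = [2, 4, 8, 16, 32, 64]
best = min(candidates, key=vfold_risk)
```

Note that each fold trains on a fraction (V - 1)/V of the data, which is the mechanism behind the suboptimality for fixed V mentioned in the abstract.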
Nonasymptotic quasi-optimality of AIC and the slope heuristics in maximum likelihood estimation of density using histogram models
48 p. We consider nonparametric maximum likelihood estimation of density using linear histogram models. More precisely, we investigate the optimality of model selection procedures via penalization when the number of models is polynomial in the number of data. It turns out that the slope heuristics first formulated by Birgé and Massart [10] is satisfied under rather mild conditions on the density to be estimated and the structure of the considered partitions, and that the minimal penalty is equivalent to half of the AIC penalty.
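The slope heuristics used above can be sketched as follows: estimate the slope of the empirical risk against D/n over the largest models (the minimal penalty), then select the model minimizing the risk penalized by twice that slope. Everything below (the data distribution, the list of dyadic partitions, the cutoff for "large" models) is an illustrative assumption, not the paper's setting:

```python
# Illustrative sketch of the slope heuristics for histogram density
# estimation via maximum likelihood; setup is assumed, not from the paper.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.beta(2.0, 5.0, n)  # a density supported on [0, 1]

def neg_loglik(D):
    """Empirical risk (negative mean log-likelihood) of the D-bin histogram MLE."""
    edges = np.linspace(0.0, 1.0, D + 1)
    counts, _ = np.histogram(x, bins=edges)
    dens = counts / (n * (1.0 / D))  # histogram density value on each bin
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, D - 1)
    return -np.mean(np.log(np.maximum(dens[idx], 1e-12)))

dims = np.array([2, 4, 8, 16, 32, 64, 128, 256])
risks = np.array([neg_loglik(D) for D in dims])

# Minimal-penalty slope fitted on the largest models, where bias is small.
# The result quoted above predicts a minimal penalty close to D/(2n), i.e.
# a slope near 1/2, so the final penalty 2 * kappa * D / n is close to AIC.
large = dims >= 64
kappa = -np.polyfit(dims[large] / n, risks[large], 1)[0]
crit = risks + 2.0 * kappa * dims / n
D_hat = int(dims[np.argmin(crit)])
```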
Optimal upper and lower bounds for the true and empirical excess risks in heteroscedastic least-squares regression
58 p. We consider the estimation of a bounded regression function with nonparametric heteroscedastic noise. We are interested in the true and empirical excess risks of the least-squares estimator on a finite-dimensional vector space. For these quantities, we give upper and lower bounds in probability that are optimal at the first order. Moreover, these bounds show the equivalence between the true and empirical excess risks when, among other things, the least-squares estimator is consistent in sup-norm towards the projection of the regression function onto the considered model. Consistency in sup-norm is then proved for suitable histogram models and for more general models of piecewise polynomials endowed with a localized basis structure.
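The claimed first-order equivalence of the two excess risks can be illustrated by a small simulation (a sketch under assumed data: the regression function, noise level and design below are not the paper's). For a histogram model under a uniform design, the true excess risk weights the squared coefficient errors by the bin probabilities 1/D, while the empirical excess risk weights them by the empirical frequencies n_b/n, so the two quantities should be close for large n:

```python
# Illustrative simulation (not from the paper) of the equivalence between
# true and empirical excess risks for a regressogram with heteroscedastic noise.
import numpy as np

rng = np.random.default_rng(2)
n, D = 20000, 10
X = rng.uniform(0.0, 1.0, n)
m = lambda t: np.sin(2.0 * np.pi * t)                 # regression function
y = m(X) + (0.2 + 0.3 * X) * rng.standard_normal(n)   # heteroscedastic noise

edges = np.linspace(0.0, 1.0, D + 1)
idx = np.clip(np.searchsorted(edges, X, side="right") - 1, 0, D - 1)

# Least-squares estimator on the histogram model: bin-wise mean of y.
beta_hat = np.array([y[idx == b].mean() for b in range(D)])
# Projection of m onto the model: bin-wise average of m under the uniform
# design, in closed form for m(t) = sin(2*pi*t).
a, b = edges[:-1], edges[1:]
beta_star = D * (np.cos(2 * np.pi * a) - np.cos(2 * np.pi * b)) / (2 * np.pi)

counts = np.bincount(idx, minlength=D)
# True excess risk: squared errors weighted by the bin probabilities 1/D.
true_excess = np.sum((beta_hat - beta_star) ** 2) / D
# Empirical excess risk: same errors weighted by empirical frequencies n_b/n.
emp_excess = np.sum((counts / n) * (beta_hat - beta_star) ** 2)
ratio = emp_excess / true_excess
```

Under this setup the ratio should concentrate around 1, matching the equivalence stated in the abstract.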